Abstract

Current analyses of genomes from numerous species show 
that the diversity of organisms' functional and behavioral
characters is not proportional to the number of genes that en- 
code the organism. We investigate the hypothesis that the 
diversity of organismal character is due to hierarchical orga- 
nization. We do this with the recently introduced model of 
the finitary process soup, which allows for a detailed mathe- 
matical and quantitative analysis of the population dynamics 
of structural complexity. Here we show that global complex- 
ity in the finitary process soup is due to the emergence of 
successively higher levels of organization, that the hierarchi- 
cal structure appears spontaneously, and that the process of 
structural innovation is facilitated by the discovery and main- 
tenance of relatively noncomplex, but general individuals in 
a population. 

Introduction 

Recent estimates have shown that the genomes of many 
species consist of a surprisingly similar number of genes de- 
spite some being markedly more sophisticated and diverse 
in their behaviors. Humans have only 30% more genes
than the worm Caenorhabditis elegans; humans, mice, and
rats have nearly the same number (Lynch and Conery, 2003;
Rat Genome Sequencing Project Consortium, 2004). Moreover,
many of those genes serve to maintain elementary processes
and are shared across species, which greatly reduces
the number of genes available to account for diversity. One 
concludes that individual genes cannot directly code for the 
full array of individual functional and morphological char- 
acters of a species, as genetic determinism would have it. 
From what, then, do the sophistication and diversity of or- 
ganismal form and behavior arise? 

Here we investigate the hypothesis that these arise from 
a hierarchy of interactions between genes and between in- 
teracting gene complexes. A hierarchy of gene interactions, 
being comprised of subsets of available genes, allows for an 
exponentially larger range of functions and behaviors than 
direct gene-to-function coding. We will use a recently
introduced pre-biotic evolutionary model of the population
dynamics of structural complexity, the finitary process soup
(Crutchfield and Görnerup, 2006). Specifically, we will
show that global complexity in the finitary process soup is
due to the emergence of successively higher levels of orga- 
nization. Importantly, hierarchical structure appears sponta- 
neously and is facilitated by the discovery and maintenance 
of relatively noncomplex, but general individuals in a pop- 
ulation. These results, in concert with the minimal assump- 
tions and simplicity of the finitary process soup, strongly 
suggest that an evolving system's sophistication, complex- 
ity, and functional diversity derive from its hierarchical or- 
ganization. 

Modeling Pre-Biology 

Prior to the existence of highly sophisticated entities acted 
on by evolutionary forces, replicative objects relied on far 
more basic mechanisms for maintenance and growth. How- 
ever, these objects managed to transform, not only them- 
selves, but also indirectly the very transformations by which 
they changed (Rössler, 1979) in order to eventually support
the mechanisms of natural selection. How did the transition 
from raw interaction to evolutionary change take place? Is it 
possible to pinpoint generic properties, however basic, that 
would have enabled a system of simple interacting objects 
to take the first few steps towards biotic organization? 

To explore these questions in terms of structural complexity
we developed a theoretical model borrowing from computation
theory (Hopcroft and Ullman, 1979) and computational
mechanics (Crutchfield and Young, 1989; Crutchfield, 1994).
In this system, the finitary process soup (Crutchfield and
Görnerup, 2006), elementary objects, as represented by
ε-machines, interact and generate new objects in a well
stirred flow reactor.

Choosing ε-machines as the interacting, replicating
objects, it turns out, brings a number of advantages. Most
particularly, there is a well developed theory of their
structural properties found in the framework of computational
mechanics. In contrast with individuals in previous,
related pre-biotic models, such as machine language programs
(Rasmussen et al., 1990; Rasmussen et al., 1992; Ray, 1991;
Adami and Brown, 1994), tags (Farmer et al., 1986; Bagley
et al., 1989), λ-expressions (Fontana, 1991), and cellular
automata (Crutchfield and Mitchell, 1995), ε-machines have a
well defined (and calculable) notion of structural complexity.
For the cases of machine language and λ-calculus, in
contrast, it is known that algorithms do not even exist
to calculate such properties since these representations
are computation universal (Brookshear, 1989). Another
important distinction with prior pre-biotic models is that
the individuals in the finitary process soup do not have
two separate modes of operation, one of representation or
storage and one for functioning and transformation. The
individuals are simply objects whose internal structure
determines how they interact. The benefit of this when
modeling prebiotic evolution is that there is no assumed
distinction between gene and protein (Schrödinger, 1967;
von Neumann, 1966) or between data and program
(Rasmussen et al., 1990; Rasmussen et al., 1992; Ray, 1991;
Adami and Brown, 1994).

ε-Machines

Individuals in the finitary process soup are objects that
store and transform information. In the vocabulary
of information theory they are communication channels
(Cover and Thomas, 1991). Here we focus on a type
of finite-memory channel, called a finitary ε-machine, as
our preferred representation of an evolving
information-processing individual. To understand what this
choice captures we can think of these individuals in terms of
how they compactly describe stochastic processes.

A process is a discrete-valued, discrete-time stationary
stochastic information source (Cover and Thomas, 1991). A
process is most directly described by the bi-infinite sequence
it produces of random variables $S_t$ over an alphabet
$\mathcal{A}$:

$$\overleftrightarrow{S} = \ldots S_{t-1} S_t S_{t+1} \ldots \tag{1}$$

and the distribution $P(\overleftrightarrow{S})$ over those
sequences. At each moment $t$, we think of the bi-infinite
sequence as consisting of a history $\overleftarrow{S}_t$ and a
future $\overrightarrow{S}_t$ subsequence:
$\overleftrightarrow{S} = \overleftarrow{S}_t \overrightarrow{S}_t$.

A process stores information in its set $\mathcal{S}$ of causal
states. Mathematically, these are the members of the range of
the map $\epsilon : \overleftarrow{\mathbf{S}} \to 2^{\overleftarrow{\mathbf{S}}}$
from histories to sets of histories

$$\epsilon(\overleftarrow{s}) = \{ \overleftarrow{s}' \mid P(\overrightarrow{S} \mid \overleftarrow{S} = \overleftarrow{s}) = P(\overrightarrow{S} \mid \overleftarrow{S} = \overleftarrow{s}') \}, \tag{2}$$

where $2^{\overleftarrow{\mathbf{S}}}$ is the power set of histories
$\overleftarrow{\mathbf{S}}$. That is, the causal state $\mathcal{S}$
of a history $\overleftarrow{s}$ is the set of histories that all
have the same probability distribution of futures. The transition
from one causal state $\mathcal{S}_i$ to another $\mathcal{S}_j$
while emitting the symbol $s \in \mathcal{A}$ is given by a set of
labeled transition matrices:
$\mathcal{T} = \{ T^{(s)}_{ij} : s \in \mathcal{A} \}$, in which

$$T^{(s)}_{ij} \equiv P(\mathcal{S}' = \mathcal{S}_j, S = s \mid \mathcal{S} = \mathcal{S}_i), \tag{3}$$

where $\mathcal{S}$ is the current causal state, $\mathcal{S}'$ its
successor, and $S$ the next symbol in the sequence.

Figure 1: Example ε-machines: $T_A$ has a single causal state
and, according to its transition labels, is the identity
function. $T_B$ consists of causal states A and B and two
transitions. $T_B$ accepts two input strings, either 1010... or
0101..., and flips 0s to 1s and vice versa as it produces an
output string. Note that the function's domain and range are the
same. $T_C$ has the same domain and range as $T_B$, but does not
exchange 0s and 1s.

A process's ε-machine is the ordered pair
$\{\mathcal{S}, \mathcal{T}\}$. Finitary ε-machines are
stochastic finite-state machines with the following properties
(Crutchfield and Young, 1989): (i) All recurrent states form a
single strongly connected component. (ii) All transitions are
deterministic in the specific sense that a causal state together
with the next symbol determine a unique next state. (iii) The
set of causal states is finite and minimal.
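Properties (i) and (ii) can be checked mechanically on a small example. The following is a minimal sketch under assumptions that are not in the source: the dictionary encoding of a machine, the function names, and the two example machines are all hypothetical, and property (iii), minimality, is not tested.

```python
# Hypothetical encoding (not from the source): a machine maps each state to a
# list of (symbol, next_state) transitions; a symbol is a label such as "1|0".

def is_deterministic(machine):
    """Property (ii): a state plus a symbol fixes a unique next state."""
    for edges in machine.values():
        symbols = [sym for sym, _ in edges]
        if len(symbols) != len(set(symbols)):
            return False
    return True

def reachable(machine, start):
    """Set of states reachable from `start` by following transitions."""
    seen, stack = {start}, [start]
    while stack:
        state = stack.pop()
        for _, nxt in machine[state]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def is_strongly_connected(machine):
    """Property (i), assuming every listed state is meant to be recurrent."""
    states = set(machine)
    return all(reachable(machine, s) == states for s in states)

# A two-state machine that alternates between its states, and a broken variant
# with an ambiguous symbol and a dead-end state.
flip = {"A": [("1|0", "B")], "B": [("0|1", "A")]}
broken = {"A": [("1|0", "B"), ("1|0", "A")], "B": []}

print(is_deterministic(flip), is_strongly_connected(flip))      # True True
print(is_deterministic(broken), is_strongly_connected(broken))  # False False
```

The reachability test is quadratic in the number of states, which is adequate for the two-state machines used to seed the soup.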

In the finitary process soup we use the alphabet
$\mathcal{A} = \{0|0, 0|1, 1|0, 1|1\}$ consisting of pairs in|out
of input and output symbols over a binary alphabet
$\mathcal{B} = \{0, 1\}$. When used in this way ε-machines read
in strings over $\mathcal{B}$ and emit strings over
$\mathcal{B}$. Accordingly, they should be viewed as mappings
from one process $\overleftrightarrow{S}_{\mathrm{input}}$ to
another $\overleftrightarrow{S}_{\mathrm{output}}$. They are, in
fact, simply functions, each with a domain (the set of strings
that can be read) and with a range (the set of strings that can
be produced). In this way, we consider ε-machines as models of
objects that store and transform information. In the
following we will take the transitions from each causal state
to have equal probabilities. Figure 1 shows several examples
of simple ε-machines.

Given that ε-machines are transformations, one can ask
how much processing they do: how much structure do they
add to the inputs when producing an output? Due to the
properties mentioned above, one can answer this question
precisely. Ignoring input and output symbols, the
state-to-state transition probabilities are given by an
ε-machine's stochastic connection matrix:
$T = \sum_{s \in \mathcal{A}} T^{(s)}$. The causal-state
probability distribution $p_{\mathcal{S}}$ is given by the left
eigenvector of $T$ associated with eigenvalue 1 and normalized in
probability. If $M$ is an ε-machine, then the amount of
information storage it has, and can add to an input process, is
given by $M$'s structural complexity

$$C_\mu(M) = -\sum_{v \in \mathcal{S}} p_{\mathcal{S}}^{(v)} \log_2 p_{\mathcal{S}}^{(v)}. \tag{4}$$
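As a worked illustration of Eq. (4), the sketch below computes the stationary causal-state distribution of a connection matrix and its Shannon entropy. The function names and the averaged fixed-point iteration are choices made here, not taken from the source; the iteration is a simple stand-in for an eigenvector solver and assumes the states form a single strongly connected component.

```python
from math import log2

def stationary_distribution(T, sweeps=10_000):
    """Left eigenvector of T for eigenvalue 1, normalized to sum to one.

    Iterates p <- (p + pT)/2; the averaging keeps periodic chains, such as a
    two-state machine that alternates between its states, from oscillating.
    """
    n = len(T)
    p = [1.0 / n] * n
    for _ in range(sweeps):
        pT = [sum(p[i] * T[i][j] for i in range(n)) for j in range(n)]
        p = [(a + b) / 2 for a, b in zip(p, pT)]
    return p

def structural_complexity(T):
    """Shannon entropy in bits of the causal-state distribution, as in Eq. (4)."""
    return sum(q * log2(1.0 / q) for q in stationary_distribution(T) if q > 0)

# One causal state: nothing is stored.
print(structural_complexity([[1.0]]))
# Two states visited equally often: one bit.
print(structural_complexity([[0.0, 1.0], [1.0, 0.0]]))
# Two states with unequal occupation (2/3 and 1/3): about 0.918 bits.
print(structural_complexity([[0.5, 0.5], [1.0, 0.0]]))
```

The third matrix follows the equal-probability convention stated above: the first state has two outgoing transitions, each taken with probability 1/2, and the second state has one.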




Figure 2: Interaction network for the ε-machines of Fig. 1.
It is a meta-machine.



ε-Machine Interaction

ε-machines interact by functional composition. Two machines
$T_A$ and $T_B$ that act on each other result in a third
$T_C = T_B \circ T_A$, where $T_C$ (i) has the domain of $T_A$
and the range of $T_B$ and (ii) is minimized. If $T_A$ and $T_B$
are incompatible, e.g., the domain of $T_B$ does not overlap with
the range of $T_A$, the interaction produces nothing; it is
considered elastic. During composition the size of the resulting
ε-machine can grow very rapidly (geometrically):
$|T_C| \le |T_B| \times |T_A|$.

Interaction Network 

We monitor the interactions of objects in the soup via the
interaction network $\mathcal{G}$. This is represented as a graph
whose nodes correspond to ε-machines and whose transitions
correspond to interactions. If $T_k = T_j \circ T_i$ occurs in
the soup, then the edge from $T_i$ to $T_k$ is labeled $T_j$. One
can represent $\mathcal{G}$ with the binary matrices:

$$\mathcal{G}^{(k)}_{ij} = \begin{cases} 1 & \text{if } T_k = T_j \circ T_i, \\ 0 & \text{otherwise.} \end{cases} \tag{5}$$

For the set of ε-machines in Fig. 1, for example, we have
the interaction graph shown in Fig. 2 that is given by the
matrices:
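The interaction matrices for the three machines can be recomputed from the description in Fig. 1's caption. The following is a sketch under stated assumptions, not the authors' implementation: the state-level encodings of $T_A$, $T_B$, and $T_C$ are inferred from the caption, `compose` is a plain product construction that prunes dead-end states but performs no determinization or minimization (none is needed for these three machines), and `same_machine` compares machines by brute-force state relabeling.

```python
from itertools import permutations

# Assumed encodings of Fig. 1's machines: state -> list of (in, out, next_state).
MACHINES = {
    "A": {"a": [(0, 0, "a"), (1, 1, "a")]},         # identity on all strings
    "B": {"a": [(1, 0, "b")], "b": [(0, 1, "a")]},  # flips 1010... and 0101...
    "C": {"a": [(1, 1, "b")], "b": [(0, 0, "a")]},  # same domain as B, no flip
}

def compose(second, first):
    """Product construction for second o first, with dead-end states pruned."""
    prod = {
        (p, q): [(x, z, (p2, q2))
                 for (x, y, p2) in first[p]
                 for (y2, z, q2) in second[q] if y == y2]
        for p in first for q in second
    }
    while True:  # drop states without outgoing or incoming transitions
        targets = {nxt for edges in prod.values() for (_, _, nxt) in edges}
        keep = {s for s, edges in prod.items() if edges and s in targets}
        if len(keep) == len(prod):
            return prod
        prod = {s: [e for e in prod[s] if e[2] in keep] for s in keep}

def same_machine(m1, m2):
    """True if m1 and m2 are equal up to a relabeling of states."""
    s1, s2 = list(m1), list(m2)
    if len(s1) != len(s2):
        return False
    for perm in permutations(s2):
        rename = dict(zip(s1, perm))
        if all({(x, z, rename[n]) for x, z, n in m1[s]} == set(m2[rename[s]])
               for s in s1):
            return True
    return False

names = list(MACHINES)  # ["A", "B", "C"]
G = {k: [[0] * len(names) for _ in names] for k in names}
for i, ti in enumerate(names):
    for j, tj in enumerate(names):
        product = compose(MACHINES[tj], MACHINES[ti])  # T_j o T_i
        for k in names:
            if product and same_machine(product, MACHINES[k]):
                G[k][i][j] = 1  # Eq. (5): entry is 1 if T_k = T_j o T_i

for k in names:
    print(k, G[k])
# A [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
# B [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
# C [[0, 0, 1], [0, 1, 0], [1, 0, 1]]
```

Under these encodings the only composition that yields $T_A$ is $T_A \circ T_A$, and every product stays within the set of three machines.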
To measure the diversity of interactions in a population
we define the interaction network complexity

$$\ldots \sum_{f_i, f_j, f_k > 0} \ldots \tag{6}$$

where

$$\ldots = \begin{cases} f_i f_j, & T_k = T_j \circ T_i \text{ has occurred}, \\ 0, & \text{otherwise}, \end{cases} \tag{7}$$

$\ldots$ is a normalizing factor, and $f_i$ is the fraction
of ε-machine type $i$ in the soup. In order to emphasize our
interest in actual reproduction pathways, we consider only
those that have occurred in the soup.

Figure 3: The meta-machine to which that in Fig. 2 decays
under the population dynamics of Eq. (8).

Meta-Machines

Given a population $\mathbf{P}$ of ε-machines, we define a
meta-machine $\Omega \subset \mathbf{P}$ to be a connected set of
ε-machines that is invariant under composition. That is,
$\Omega$ is a meta-machine if and only if (i)
$T_j \circ T_i \in \Omega$ for all $T_i, T_j \in \Omega$, (ii) for
all $T_k \in \Omega$, there exist $T_i, T_j \in \Omega$ such that
$T_k = T_j \circ T_i$, and (iii) there is a nondirected path
between every pair of nodes in $\Omega$'s interaction network
$\mathcal{G}_\Omega$. The interactions in Fig. 2 describe a
meta-machine of Fig. 1's ε-machines.

The meta-machine captures the notion of a self-replicating
and autonomous entity and is consistent with
Maturana and Varela's autopoietic set (Varela et al., 1974),
Eigen and Schuster's hypercycle (Schuster, 1977), and
Fontana and Buss' organization (Fontana and Buss, 1996).

Population Dynamics

We employ a continuously stirred flow reactor with an influx
rate $\Phi_{in}$ that consists of a population $\mathbf{P}$ of
$N$ ε-machines. The dynamics of the population is iteratively
ruled by compositions and replacements as follows:

1. ε-machine Generation:

(a) With probability $\Phi_{in}$, generate a random ε-machine
$T_R$ (influx).

(b) With probability $1 - \Phi_{in}$ (reaction):

i. Select $T_A$ and $T_B$ randomly.

ii. Form the composition $T_C = T_B \circ T_A$.

2. ε-machine Outflux:

(a) Select an ε-machine $T_D$ randomly from the population.

(b) Replace $T_D$ with either $T_C$ or $T_R$.

Below, $T_R$ will be uniformly sampled from the set of all
two-state ε-machines. This set is also used when initializing
the population. The insertion of $T_R$ corresponds to the influx
while the removal of $T_D$ corresponds to the outflux. The
latter keeps the population size constant. Note that there is no
spatial dependence in this model; ε-machines are picked
uniformly from the population for each replication. The finitary
process soup here is a well stirred gas of reacting objects.

When there is no influx ($\Phi_{in} = 0$) and the population is
closed with respect to composition, the population dynamics
is described by a finite-dimensional set of equations:
$$\ldots \tag{8}$$

where $f^{(k)}_t$ is the frequency of ε-machine type $k$ at
time $t$ and $\ldots$ is a normalization factor.

Figure 4: (a) Population-averaged ε-machine complexity
$\langle C_\mu(T) \rangle$ and (b) run-averaged interaction
network complexity $\langle C_\mu(\mathcal{G}) \rangle$ as a
function of time $t$ and influx rate $\Phi_{in}$ for a
population of $N = 100$ objects. (Reprinted with permission
from (Crutchfield and Görnerup, 2006).)
In addition to capturing the notion of self-replicating
entities, meta-machines also describe an important type of
invariant set of the population dynamics. Formally, we have

$$\Omega = \mathcal{G} \circ \Omega. \tag{9}$$

These invariant sets can be stable or unstable under the
population dynamics. Note that the meta-machine of Fig. 2
is unstable: only $T_A$ produces $T_A$s. As such, over time
the population dynamics will decay to the meta-machine of
Fig. 3, which describes a soup consisting only of $T_B$s and
$T_C$s. This example also happens to illustrate that copying,
implemented here by the identity object $T_A$, need not
dominate the population and so does not have to be removed by
hand, as done in several prior pre-biotic models. It can decay
away due to the intrinsic population dynamics.
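The generation and outflux steps listed under Population Dynamics can be sketched as a short stochastic loop and used to watch this decay. The sketch makes several assumptions that are not in the source: machine types are represented only by their labels, the composition table is derived from the caption of Fig. 1 rather than computed from the machines, the population size of 99 and the seed are arbitrary, and the influx branch draws from the initial types as a stand-in for sampling a random two-state ε-machine.

```python
import random
from collections import Counter

# Composition table for Fig. 1's machines, keyed as (T_j, T_i) -> T_j o T_i.
# It is derived from the figure caption and is an assumption of this sketch.
COMPOSE = {
    ("A", "A"): "A", ("B", "A"): "B", ("C", "A"): "C",
    ("A", "B"): "B", ("B", "B"): "C", ("C", "B"): "B",
    ("A", "C"): "C", ("B", "C"): "B", ("C", "C"): "C",
}

def run_soup(population, steps, influx_rate=0.0, seed=0):
    """Iterate the generation and outflux steps on a list of machine types."""
    rng = random.Random(seed)
    soup = list(population)
    types = sorted(set(soup))
    for _ in range(steps):
        if rng.random() < influx_rate:
            new = rng.choice(types)            # influx: stand-in for a random T_R
        else:
            t_a, t_b = rng.choice(soup), rng.choice(soup)
            new = COMPOSE[(t_b, t_a)]          # reaction: T_C = T_B o T_A
        soup[rng.randrange(len(soup))] = new   # outflux: replace a random T_D
    return Counter(soup)

# Closed soup (no influx) started with equal numbers of the three types.
counts = run_soup(["A", "B", "C"] * 33, steps=20_000)
print(counts)  # the identity type "A" is no longer present
```

Because $T_A$ is produced only when two copies of $T_A$ meet but is removed in proportion to its frequency, its count drifts downward and, once it reaches zero, cannot recover in a closed soup.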

Simulations 

A system constrained by closure forms one useful base case
that allows for a straightforward analysis of the population
dynamics. It does not, however, permit the innovation of
structural novelties in the soup on either the level of
individual objects (ε-machines) or on the level of their
interactions. What we are interested in is the possibility of
open-ended evolution of ε-machines and their meta-machines.
When enabled as an open system, both with respect to
composition and influx, the soup constitutes a constructive
dynamical system and the population dynamics of Eq. (8) do not
strictly apply. (The open-ended population dynamics of epochal
evolution is required (Crutchfield and van Nimwegen, 2000).)

We first set the influx rate to zero in order to study
dynamics that is ruled only by compositional transformations.
One important first observation is that almost the
complete set of machine types that are represented in the
soup's initial random population is replaced over time.
Thus, even at the earliest times, the soup generates genuine
novelty. The population-averaged individual complexity
increases initially, as Fig. 4(a) ($\Phi_{in} \approx 0$) from
(Crutchfield and Görnerup, 2006) shows. The ε-machines
are to some extent shaped by the selective pressure coming
from outflux and by geometric growth due to composition.
The turn-over is due to the dominance of nonreproducing
ε-machines in the initial population.
$\langle C_\mu(T) \rangle$ subsequently declines since it is
favorable to be simple as it takes a more extensive stochastic
search to find reproductive interactions that include more
complex ε-machines.

Figure 5: Meta-machine decomposition in a closed soup: 15
separate runs with $N = 500$. While the minimal 4-element
meta-machine $\Omega_4$ (shown) dominates the soup,
$C_\mu(\mathcal{G})$ is bounded by 4 bits. Once outflux removes
one of its ε-machines, $\Omega_4$ rapidly decays to $\Omega_2$,
a 2-element meta-machine (shown). ($\Omega_4$ does not contain a
sub-meta-machine of 3 ε-machines.) At this point
$C_\mu(\mathcal{G})$ is bounded by 2 bits. After some period of
time, $\Omega_2$ decays to $\Omega_1$, a single self-reproducing
ε-machine (shown), and $C_\mu(\mathcal{G})$ is fixed at 0.

Note (Fig. 4(b), $\Phi_{in} \approx 0$) that the run-averaged
interaction complexity $\langle C_\mu(\mathcal{G}) \rangle$
reaches a significantly higher value than
$\langle C_\mu(T) \rangle$, implying that the population's
structural complexity derives from its network of interactions
rather than the complexity of its constituent individuals.
$\langle C_\mu(\mathcal{G}) \rangle$ continues to grow while
compositional paths are discovered and created. A maximum is
eventually reached after which
$\langle C_\mu(\mathcal{G}) \rangle$ declines and settles down
to zero when one single type of self-reproducing ε-machine
takes over the whole population.

By monitoring the individual run values of $C_\mu(\mathcal{G})$
rather than the ensemble average, one sees that they form
plateaus as shown in Fig. 5. The plateaus, at
$C_\mu(\mathcal{G}) = 4$ bits and, most notably, at
$C_\mu(\mathcal{G}) = 2$ bits and 0 bits, are determined
by the largest meta-machine that is present at a given time.
Being a closed set, the meta-machine does not allow any
novel ε-machines to survive and this gives the upper bound
on $C_\mu(\mathcal{G})$. As one ε-machine type is removed from
$\Omega$ by the outflux, the meta-machine decomposes and the
upper bound on $C_\mu(\mathcal{G})$ lowers. This produces a
stepwise and irreversible succession of meta-machine
decompositions.
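The plateau heights can be illustrated numerically. The sketch below assumes that the interaction network complexity is the Shannon entropy of the weights $f_i f_j$ of realized compositions, normalized over all realized compositions at once. That normalization is an assumption of this sketch rather than a transcription of Eq. (6); it is chosen because it reproduces the 4-, 2-, and 0-bit bounds quoted for Fig. 5. The 4-element table is a hypothetical closed set (addition modulo 4) standing in for $\Omega_4$, whose actual machines are not listed in the text.

```python
from math import log2

def network_complexity(freqs, table):
    """Entropy in bits of the normalized weights f_i * f_j.

    `table[(j, i)]` names the product T_j o T_i. Pairs missing from the table
    have not occurred and contribute nothing, as do types with zero frequency.
    """
    weights = [
        freqs.get(i, 0) * freqs.get(j, 0)
        for (j, i), k in table.items()
        if freqs.get(i, 0) > 0 and freqs.get(j, 0) > 0 and freqs.get(k, 0) > 0
    ]
    total = sum(weights)
    return sum((w / total) * log2(total / w) for w in weights)

# Hypothetical closed 4-element table, standing in for the meta-machine of Fig. 5.
omega_4 = {(j, i): (i + j) % 4 for i in range(4) for j in range(4)}
# The two-element set {T_B, T_C} and the single self-reproducing T_C.
omega_2 = {("B", "B"): "C", ("C", "C"): "C", ("B", "C"): "B", ("C", "B"): "B"}
omega_1 = {("C", "C"): "C"}

print(network_complexity({k: 0.25 for k in range(4)}, omega_4))  # 4.0
print(network_complexity({"B": 0.5, "C": 0.5}, omega_2))         # 2.0
print(network_complexity({"C": 1.0}, omega_1))                   # 0.0
```

For a closed set of $n$ types every ordered pair yields exactly one product, so under this assumption the entropy is at most $\log_2 n^2$ bits and reaches that value only when all types are equally frequent.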

Thus, in the case of zero influx, one sees that the soup
moves from one extreme to another. It is completely
disordered initially, generates structural complexity in its
individuals and in its interaction network, runs out of
resources (poorly reproducing ε-machines that are consumed by
outflux), and decomposes down to a single type of simple
self-reproducing ε-machine.

Figure 6: Meta-machine hierarchy of dynamical composition
and decomposition. Dots denote ε-machines. An isolated dot
denotes a self-replicating ε-machine. Solid lines denote
$T_A \to T_C$ transitions. Dashed lines denote $T_B \to T_C$
transitions. Although all possible transitions are used by the
meta-machines shown, they are represented in a simplified
way according to $\Omega_4$; cf. Fig. 5.

Although Fig. 5 shows only three plateaus, there is in
principle one plateau for every meta-machine that at some
point is the largest one generated by the soup. The diagram
in Fig. 6 summarizes our results from a more extensive and
systematic survey of meta-machine hierarchies from a series
of runs with $N = 500$. It gives one illuminating example of
how the soup spontaneously generates hierarchies of
meta-machines.

Leaving closed soups behind, we now investigate the effects
of influx. Recall the population-averaged ε-machine
complexity $\langle C_\mu(T) \rangle$ and the run-averaged
interaction network complexity
$\langle C_\mu(\mathcal{G}) \rangle$ as a function of $t$ and
$\Phi_{in}$ shown in Fig. 4. Over time,
$\langle C_\mu(T) \rangle$ behaves similarly for
$\Phi_{in} > 0$ as it does when $\Phi_{in} \approx 0$. It
increases rapidly initially, reaches a peak, and declines to a
steady state. Notably, the emergence of complex organizations
of interaction networks occurs where the average structural
complexity of the ε-machines is low. Stationary
$\langle C_\mu(T) \rangle$ is instead maximized at a relatively
high influx rate ($\Phi_{in} \approx 0.75$) at which
$\langle C_\mu(\mathcal{G}) \rangle$ is small compared to its
maximum. As $\Phi_{in}$ is increased, so is
$\langle C_\mu(\mathcal{G}) \rangle$ at large times.
$\langle C_\mu(\mathcal{G}) \rangle$ is maximized around
$\Phi_{in} \approx 0.1$. For higher influx rates, individual
novelty has a deleterious effect on the sophistication of a
population's interaction network. Existing reproductive paths
do not persist due to the low rate of successful compositions
of highly structured (and so specialized) individuals. We found
that the maximum network complexity $C_\mu(\mathcal{G})$ grows
slowly and linearly over time at
$\approx 7.6 \cdot 10^{-4}$ bits/replication.

Summary and Conclusions 

To understand the basic mechanisms driving the evolutionary
emergence of structural complexity in a quantitative and
tractable pre-biotic setting, we investigated a well stirred
soup of ε-machines (finite-memory communication channels)
that react with each other by composition and so generate
new ε-machines. When the soup is open with respect
to composition and influx, it spontaneously builds structural
complexity on the level of transformative relations among
the ε-machines rather than in the ε-machine individuals
themselves. This growth is facilitated by the use of relatively
non-complex individuals that represent general and elementary
local functions rather than highly specialized individuals.
The soup thus maintains local simplicity and generality
in order to build up hierarchical structures that support
global complexity. Novel computational representations are
intrinsically introduced in the form of meta-machines that,
in turn, are interrelated in a hierarchy of composition and
decomposition. Computationally powerful local representations
are thus not necessary (nor effective) for the emergence and
growth of complex replicative processes in the finitary
process soup. Meta-machines in closed soups eventually decay.
For $C_\mu(\mathcal{G})$ to be maintained and to grow, the soup
must be fed with novel material in the form of random
ε-machines. Otherwise, any spontaneously generated
meta-machines are decomposed (due to finite-population
sampling) and the population eventually consists of a single
type of trivially self-reproducing ε-machine. At an
intermediate influx rate, however, the interaction network
complexity is not only maintained but grows linearly with time.
This, then, suggests the possibility of open-ended evolution of
increasingly sophisticated organizations.